Networks can provide communication paths for accessing electronic information. One example of such a network is the Internet, which can provide communication paths to access a plurality of web sites. A web site can be formed of a number of individual web pages that are linked together. With the proliferation of web pages, determining the similarity of various web pages can be useful. Similar web pages can include identical web pages, and may also include some web pages that are non-identical. Determining whether web pages are identical can be a straightforward process. Determining whether non-identical web pages are similar can be more challenging.
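As an illustration of why identical web pages are the easier case, the following minimal sketch compares pages by a content digest. It is one common approach and is not taken from the description above; the function names `page_digest` and `are_identical` and the sample page contents are hypothetical. Two byte-for-byte identical pages produce the same digest, while a page that differs by a single character produces a different digest, so this kind of exact comparison gives no indication of how similar two non-identical pages are.

```python
import hashlib


def page_digest(content: bytes) -> str:
    """Return a SHA-256 hex digest of the raw page content."""
    return hashlib.sha256(content).hexdigest()


def are_identical(page_a: bytes, page_b: bytes) -> bool:
    """Treat two pages as identical when their digests match."""
    return page_digest(page_a) == page_digest(page_b)


# Hypothetical page contents, used for illustration only.
page_1 = b"<html><body><p>Welcome to the example site.</p></body></html>"
page_2 = b"<html><body><p>Welcome to the example site.</p></body></html>"
page_3 = b"<html><body><p>Welcome to the example site!</p></body></html>"

print(are_identical(page_1, page_2))  # same bytes, same digest
print(are_identical(page_1, page_3))  # one character differs, digests differ
```

The exact-match test answers only yes or no. Deciding whether `page_1` and `page_3` are similar would require some measure of degree, which is the harder problem noted above.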